<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>

    <title>Use Regular Expressions</title>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
  <link href="../../stylesheet.css" type="text/css" rel="stylesheet"/>
<link href="../../page_styles1.css" type="text/css" rel="stylesheet"/>

  


<link href="../../calibreHtmlOutBasicCss.css" type="text/css" rel="stylesheet" />

</head>
<body>

<div class="calibreMeta">
  <div class="calibreMetaTitle">
  
  
    
    <h1>
      <a href="../../../index.html">Sigil User Guide
</a>
    </h1>
    
    
  
  </div>
  <div class="calibreMetaAuthor">
    0.7.2

  </div>
</div>

<div class="calibreMain">

  <div class="calibreEbookContent">
    
    

    
  <h2 class="calibre5">Regular Expression Reference</h2>

  <p class="h2subheading">— Regex Reference —</p>

  <p class="calibre1">This tutorial is a quick reference for regular expressions (regex).</p>

  <p class="calibre1">Regex is a powerful way to edit and clean up your EPUB with Find &amp; Replace. It can be confusing at first because there are many commands to choose from. This page summarizes some of the basic ones.</p>

  <p class="calibre1">You can find more details about regex at these links (or ask for help in the Sigil forum):</p>

  <ul class="calibre3">
    <li class="calibre4"><a href="http://www.mobileread.com/forums/showthread.php?t=118569">Regex overview at MobileRead</a></li>

    <li class="calibre4"><a href="http://www.cheatography.com/davechild/cheat-sheets/regular-expressions/">A one page command summary</a></li>

    <li class="calibre4"><a href="http://docs.python.org/library/re.html">Regex as used by Python</a></li>

    <li class="calibre4"><a href="http://www.pcre.org/pcre.txt">Official PCRE details – start at the PCRESYNTAX(3) section</a></li>
  </ul>

  <h3 class="sigilnotintoc" id="regex_overview">Overview</h3>

  <p class="calibre1">Regex commands let Find &amp; Replace search for patterns instead of exact text. For example, you can search for "a number" instead of "8", or match any words between certain tags and then replace them with new tags or new words.</p>

  <p class="calibre1">To search for a pattern, use regex commands in place of the specific text in your search. For instance, instead of searching for <span class="example">8</span> you could search for <span class="example">\d</span> (which means one digit) or <span class="example">\d+</span> (which means one or more digits).</p>

  <p class="calibre1">For example, this Find &amp; Replace command using regex changes all level 1 headings to level 2 headings:</p>

  <ul class="calibre3">
    <li class="calibre4"><span class="listheading">Find:</span> <span class="example">&lt;h1(.*?)&lt;/h1&gt;</span></li>

    <li class="calibre4"><span class="listheading">Replace:</span> <span class="example">&lt;h2\1&lt;/h2&gt;</span></li>
  </ul>

  <p class="calibre1">Here the Find command matches everything between the h1 start and end tags and uses <span class="example">()</span> to save it. The start tag is left open, without its closing <span class="example">&gt;</span>, so that the saved text also includes any attributes such as a class definition, and the <span class="example">?</span> tells <span class="example">.*</span> to do minimal matching. The Replace command then writes the saved text (indicated by <span class="example">\1</span>) back between the h2 start and end tags.</p>

  <p class="calibre1">The following sections give a sample of common regex commands that you may find useful.</p>

  <h3 class="sigilnotintoc" id="regex_modifiers">General Modifiers:</h3>

  <p class="calibre1">You can put these commands at the start of any regex to modify how the expression behaves:</p>

  <ul class="calibre3">
    <li class="calibre4"><span class="example">(?s)</span> – lets <span class="example">.</span> match newlines as well, so that <span class="example">.*</span> can search across lines. By default <span class="example">.*</span> only matches within a single line of text. This is the same as the DotAll option in Find &amp; Replace.</li>

    <li class="calibre4"><span class="example">(?U)</span> – makes quantifiers match as little text as possible, so the match stops at the first place the pattern can end (also called minimal or ungreedy matching). This is very useful with <span class="example">.*</span> to avoid matching too much. This is the same as the Minimal Match option in Find &amp; Replace.</li>

    <li class="calibre4"><span class="example">(?i)</span> – ignore case when matching.</li>
  </ul>

  <p class="calibre1">You can combine modifiers, e.g. <span class="example">(?sU)</span> is commonly used.</p>
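  <p class="calibre1">As a rough illustration of these modifiers outside Sigil, here is a minimal sketch using Python's <span class="example">re</span> module. Using Python is an assumption made only to give a runnable example: it is not the PCRE flavor this page links to, and Python's <span class="example">re</span> has no <span class="example">(?U)</span> modifier, so only <span class="example">(?s)</span> and <span class="example">(?i)</span> are shown. The sample text is made up.</p>

```python
import re

# Hypothetical sample text spanning two lines.
text = "Chapter One\nThe Beginning"

# Without (?s), "." stops at the newline, so the pattern cannot span both lines.
no_dotall = re.search(r"Chapter.*Beginning", text)

# With (?s), "." also matches the newline, so the whole text is matched.
with_dotall = re.search(r"(?s)Chapter.*Beginning", text)

# (?i) ignores case; modifiers can be combined at the start, as in (?si).
ignore_case = re.search(r"(?si)chapter.*beginning", text)

print(no_dotall)                      # None
print(repr(with_dotall.group(0)))     # 'Chapter One\nThe Beginning'
print(repr(ignore_case.group(0)))     # 'Chapter One\nThe Beginning'
```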

  <h3 class="sigilnotintoc">Single Characters:</h3>

  <p class="calibre1">You can use these regex commands to represent single characters:</p>

  <ul class="calibre3">
    <li class="calibre4"><span class="example">.</span> – matches any one character (except a newline, unless <span class="example">(?s)</span> is used)</li>

    <li class="calibre4"><span class="example">\d</span> – matches one digit</li>

    <li class="calibre4"><span class="example">\t</span> – matches one tab</li>

    <li class="calibre4"><span class="example">[abc]</span> – any one character that is a, b, or c</li>

    <li class="calibre4"><span class="example">[^abc]</span> – any one character that is NOT a, b, or c</li>

    <li class="calibre4"><span class="example">[0-9]</span> – any one digit from 0 to 9</li>

    <li class="calibre4"><span class="example">[[:alpha:]]</span> – any one letter, upper or lower case (named classes such as <span class="example">[:alpha:]</span> only work inside an outer pair of square brackets)</li>

    <li class="calibre4"><span class="example">[^[:alnum:]]</span> – any one character that is NOT a digit or an upper or lower case letter</li>
  </ul>
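  <p class="calibre1">The single-character commands above can be tried out with a short sketch like the following. It assumes Python's <span class="example">re</span> module rather than Sigil itself, and the sample text is made up. Python's <span class="example">re</span> does not support named classes such as <span class="example">[:alnum:]</span>, so an explicit ASCII range stands in for it here.</p>

```python
import re

# Hypothetical sample text.
text = "Page 42, cab 7b"

# \d matches one digit at a time.
digits = re.findall(r"\d", text)

# [abc] matches one character that is a, b, or c.
abc = re.findall(r"[abc]", text)

# "." matches any one character, so c.b matches "cab".
any_char = re.findall(r"c.b", text)

# A negated class: one character that is NOT an ASCII digit or letter.
# (Stand-in for a negated alnum class, which Python's re lacks.)
not_alnum = re.findall(r"[^0-9A-Za-z]", text)

print(digits)     # ['4', '2', '7']
print(abc)        # ['a', 'c', 'a', 'b', 'b']
print(any_char)   # ['cab']
print(not_alnum)  # [' ', ',', ' ', ' ']
```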

  <h3 class="sigilnotintoc">Multiple Characters:</h3>

  <p class="calibre1">These regex quantifiers or codes allow you to represent multiple characters:</p>

  <ul class="calibre3">
    <li class="calibre4"><span class="example">*</span> – use after a character to mean 0 or more, e.g. <span class="example">.*</span> means 0 or more of any character</li>

    <li class="calibre4"><span class="example">+</span> – use after a character to mean 1 or more, e.g. <span class="example">\d+</span> means 1 or more digits</li>

    <li class="calibre4"><span class="example">?</span> – use after a character to mean 0 or 1 instance, e.g. <span class="example">\d?</span> means 0 or 1 digit</li>

    <li class="calibre4"><span class="example">\s</span> – matches one whitespace character: a space, tab, carriage return, or newline. Use <span class="example">\s+</span> to match a run of whitespace</li>
  </ul>

  <p class="calibre1">By default a quantifier matches as much text as it can. A special character tells it to stop at the shortest match instead of the largest one, and you can use it instead of the general modifier <span class="example">(?U)</span>:</p>

  <ul class="calibre3">
    <li class="calibre4"><span class="example">?</span> – use after a quantifier to do a minimal search, e.g. <span class="example">.*?</span> or <span class="example">\d+?</span></li>
  </ul>
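  <p class="calibre1">The difference between the default (largest) match and a minimal match is easiest to see side by side. The following is a minimal sketch, assuming Python's <span class="example">re</span> module and made-up sample text. The quantifiers shown behave the same way in Python as described above.</p>

```python
import re

# Hypothetical sample text with two quoted words.
text = 'She said "yes" and then "no".'

# Greedy: .* runs on to the LAST closing quote it can find.
greedy = re.findall(r'".*"', text)

# Minimal: .*? stops at the FIRST closing quote.
lazy = re.findall(r'".*?"', text)

# + means one or more, ? means zero or one.
numbers = re.findall(r"\d+", "Pages 7 to 112")
optional = re.findall(r"colou?r", "color colour")

# \s+ matches a run of whitespace (spaces, tabs, newlines).
collapsed = re.sub(r"\s+", " ", "one  \t two\n three")

print(greedy)     # ['"yes" and then "no"']
print(lazy)       # ['"yes"', '"no"']
print(numbers)    # ['7', '112']
print(optional)   # ['color', 'colour']
print(collapsed)  # one two three
```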

  <h3 class="sigilnotintoc">Groups:</h3>

  <p class="calibre1">Parentheses group parts of an expression together, for example to save the matched text for a later replace or to indicate a choice of words:</p>

  <ul class="calibre3">
    <li class="calibre4"><span class="example">(atext|btext)</span> – atext or btext</li>

    <li class="calibre4"><span class="example">(group)</span> – saves the text matched inside the parentheses for later retrieval. You can use multiple groups</li>
  </ul>
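  <p class="calibre1">Both uses of parentheses can be combined in one expression. The sketch below assumes Python's <span class="example">re</span> module and a made-up line of text. The first group offers a choice of words and the second saves the number that follows.</p>

```python
import re

# Hypothetical sample text.
text = "Chapter 1, Part 2, Section 3"

# Group 1 is a choice of words; group 2 saves the digits after it.
pattern = r"(Chapter|Part) (\d+)"

# findall returns one (group 1, group 2) pair per match.
matches = re.findall(pattern, text)

# A single match exposes each saved group by number.
first = re.search(pattern, text)

print(matches)         # [('Chapter', '1'), ('Part', '2')]
print(first.group(1))  # Chapter
print(first.group(2))  # 1
```

  <p class="calibre1">"Section 3" is not matched because "Section" is not one of the words offered in the first group.</p>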

  <h3 class="sigilnotintoc">Replacing:</h3>

  <p class="calibre1">One of the most useful features when replacing text is reusing text that was matched and stored in a group. Once you have grouped text you can reference it in your Replace command:</p>

  <ul class="calibre3">
    <li class="calibre4"><span class="example">\1</span> – use in Replace to retrieve the value of a saved group (use <span class="example">\2</span> for the second group, etc.)</li>
  </ul>
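  <p class="calibre1">To see saved groups at work in a replacement, here is a minimal sketch assuming Python's <span class="example">re.sub</span>, which also accepts <span class="example">\1</span> and <span class="example">\2</span> in its replacement text. The names and chapter text are made up for illustration.</p>

```python
import re

# Hypothetical sample text in "Surname, Forename" order.
text = "Smith, John and Doe, Jane"

# Group 1 saves the surname, group 2 the forename;
# the replacement writes them back in the opposite order.
swapped = re.sub(r"([A-Za-z]+), ([A-Za-z]+)", r"\2 \1", text)

# A saved group can also be kept while the text around it changes.
renamed = re.sub(r"Chapter (\d+)", r"Part \1", "Chapter 12")

print(swapped)  # John Smith and Jane Doe
print(renamed)  # Part 12
```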



  </div>

  
  


</div>

</body>
</html>
